Update Strategies for DBpedia Live

Authors

  • Claus Stadler
  • Michael Martin
  • Jens Lehmann
  • Sebastian Hellmann
Abstract

Wikipedia is one of the largest public information spaces, with a huge user community that collaboratively works on the largest online encyclopedia. Its users add or edit up to 150 thousand wiki pages per day. The DBpedia project extracts RDF from Wikipedia and interlinks it with other knowledge bases. In the DBpedia live extraction mode, Wikipedia edits are processed instantly to update the information in DBpedia. Because of the high number of edits and the growth of Wikipedia, the update process has to be very efficient and scalable. In this paper, we present different strategies to tackle this challenging problem and describe how we modified the DBpedia live extraction algorithm to work more efficiently.
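
The abstract does not say which update strategies the paper compares. As a purely illustrative sketch, and not necessarily one of the authors' strategies, one generic way to keep a triple store in sync with an edited page is to diff the triples extracted before and after the edit and apply only the difference. The names `diff_triples` and `apply_update`, and the `ex:` triples in the usage note, are hypothetical.

```python
from typing import Set, Tuple

# A triple is modelled as a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


def diff_triples(old: Set[Triple], new: Set[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (to_delete, to_insert) needed to move a page from `old` to `new`."""
    to_delete = old - new   # triples that no longer hold after the edit
    to_insert = new - old   # triples introduced by the edit
    return to_delete, to_insert


def apply_update(store: Set[Triple], old: Set[Triple], new: Set[Triple]) -> Set[Triple]:
    """Apply only the changed triples to an in-memory store (assumed stand-in
    for a real triple store) and return the updated store."""
    to_delete, to_insert = diff_triples(old, new)
    return (store - to_delete) | to_insert
```

For example, if an edit changes only the hypothetical triple `("ex:Leipzig", "ex:population", "500000")` to a new value, `diff_triples` yields one deletion and one insertion, and every unchanged triple of the page is left untouched in the store.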

Similar resources

DBpedia and the live extraction of structured data from Wikipedia

Purpose – DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the Web using Linked Data and SPARQL. However, the DBpedia release process is heavy-weight and releases are sometimes based on several months old data. DBpedia-Live solves this problem by providing a live synchronization method based on the update stream...

Interest-Based RDF Update Propagation

Many LOD datasets, such as DBpedia and LinkedGeoData, are voluminous and process large amounts of requests from diverse applications. Many data products and services rely on full or partial local LOD replications to ensure faster querying and processing. While such replicas enhance the flexibility of information sharing and integration infrastructures, they also introduce data duplication with ...

A Spotlight on Spotlight

In this paper, we have made an attempt to take a look under the hood of Spotlight and understand some of the following questions amongst others: • How well-integrated is Spotlight with the file system? • How reliable is the interface between Spotlight and the file system? • Is live update a myth or a reality under heavy file system activity? Based on our experiments, it seems that Spotlight is ...

LOD2 Deliverable 3.2.2 DBpedia-Live Extraction

DBpedia is the Semantic Web mirror of Wikipedia. Wikipedia users revise Wikipedia articles almost every second. Hence, data stored in the DBpedia triplestore can quickly become outdated, and Wikipedia articles need to be re-extracted. DBpedia-Live, the result of this deliverable, enables such a continuous synchronization between DBpedia and Wikipedia. The information in this document refl...

One Knowledge Graph to Rule Them All? Analyzing the Differences Between DBpedia, YAGO, Wikidata & co

Public Knowledge Graphs (KGs) on the Web are considered a valuable asset for developing intelligent applications. They contain general knowledge which can be used, e.g., for improving data analytics tools, text processing pipelines, or recommender systems. While the large players, e.g., DBpedia, YAGO, or Wikidata, are often considered similar in nature and coverage, there are, in fact, quite a ...

Publication date: 2010